This paper proposes a note-based music language model (MLM) for improving note-level polyphonic piano transcription. The MLM is based on a recurrent structure, which can model the temporal correlations between notes in music sequences. To combine the outputs of the note-based MLM and the acoustic model directly, we adopt an integrated architecture. We also propose an inference algorithm in which the note-based MLM predicts notes at the blank onsets in the transcription results obtained by thresholding. The experimental results show that the proposed inference algorithm improves the performance of note-level transcription. We also observe that combining the restricted Boltzmann machine (RBM) with a recurrent structure outperforms a single recurrent neural network (RNN) or long short-term memory network (LSTM) in modeling high-dimensional note sequences. Among all the MLMs, the LSTM-RBM helps the system yield the best results on all evaluation metrics, regardless of the performance of the acoustic models.